Text and Image Metasearch on the Web
Abstract
As the Web continues to increase in size, the relative coverage of Web search engines is decreasing, and search tools that combine the results of multiple search engines are becoming more valuable. This paper provides details of the text and image metasearch functions of the Inquirus search engine developed at the NEC Research Institute. For text metasearch, we describe features including the use of link information in metasearch, and provide statistics on the usage and performance of Inquirus and the Web search engines. For image metasearch, Inquirus queries multiple image search engines on the Web, downloads the actual images, and creates image thumbnails for display to the user. Inquirus handles image search engines that return direct links to images, and engines that return links to HTML pages. For the engines that return HTML pages, Inquirus analyzes the text on the pages in order to predict which images are most likely to correspond to the query. The individual image search engines tend to excel at different classes of queries, and the combination of engines is surprisingly effective at finding images corresponding to a given query. Both the text and image metasearch functions of Inquirus are surprisingly fast, and we describe the parallel architecture of the engine that provides this efficiency.
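The abstract describes querying several engines in parallel and combining their results. A minimal sketch of that general pattern follows. It is not the Inquirus implementation: the `Engine` interface, the `metasearch` function, the stub engines, and the round-robin merge are all assumptions made for illustration, and the abstract does not specify how Inquirus ranks merged results.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest
from typing import Callable, Dict, List

# Hypothetical engine interface: takes a query, returns a ranked list of URLs.
Engine = Callable[[str], List[str]]


def metasearch(query: str, engines: Dict[str, Engine]) -> List[str]:
    """Query every engine concurrently and merge the ranked lists.

    The merge is a simple round-robin interleave with URL de-duplication,
    chosen only for illustration (an assumption, not the paper's method).
    """
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map returns results in the same order as the engines dict.
        ranked_lists = list(pool.map(lambda engine: engine(query), engines.values()))

    merged: List[str] = []
    seen = set()
    # Take the top hit from each engine, then the second hit, and so on.
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Stub engines standing in for real search back ends (hypothetical data).
def engine_a(query: str) -> List[str]:
    return ["http://a.example/1", "http://shared.example/x"]


def engine_b(query: str) -> List[str]:
    return ["http://shared.example/x", "http://b.example/2", "http://b.example/3"]


results = metasearch("red sunset", {"a": engine_a, "b": engine_b})
```

Threads suit this pattern because each engine call is dominated by network wait, so the total latency stays close to that of the slowest engine rather than the sum of all of them.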
Similar resources
A World Wide Web Based Image Search Engine Using Text and Image Content Features
Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper. We first use a text-based image metasearch engine to retrieve images from the World Wide Web based on the text information on the image host pages to provide an initial image set. Because of the high-speed and low-cost nature of the text-based approach, we can easily retriev...
Detection of Heterogeneities in a Multiple Text Database Environment
As the number of text retrieval systems (search engines) grows rapidly on the World Wide Web, there is an increasing need to build search brokers (metasearch engines) on top of them. Often, the task of building an effective and efficient metasearch engine is hindered by the heterogeneities among the underlying local search engines. In this paper, we first analyze the impact of various heterogeneitie...
Captain Nemo: A Metasearch Engine with Personalized Hierarchical Search Space
Personalization of search has gained a lot of publicity in recent years. Personalization features in search and metasearch engines are a follow-up to the research done. On the other hand, text categorization methods have been successfully applied to document collections. Specifically, text categorization methods can support the task of classifying Web content in thematic hierarchies. Combining t...
Text Documents
The World Wide Web has become the largest information source in recent years, and search engines are indispensable tools for finding needed information from the Web. While modern search engine technology has its roots in text/information retrieval techniques, it also consists of solutions to unique problems arising from the Web such as web page crawling and utilizing linkage information to impr...
Analyse de la robustesse des algorithmes de méta-recherche discriminante
This paper studies the sensitivity of four metasearch engines under different situations. The focus of this analysis is on trainable metasearch engines. Our main contribution is a large scale systematic analysis of the performance and behavior of these methods on several corpora. Firstly, we analyze how the choice and normalization of the relevance score delivered by base search engines influen...
Publication date: 1999